Network Models, Integer Programming & Nonlinear
We have already learned how linear programming can help us optimize an objective function. We also talked a little about how broadly useful it is to the sciences. One reason it is so useful is that it can be extended to so many different domains.
| Vocab | Transportation | Communication | Water |
|---|---|---|---|
| Product | Buses, cars, etc. | Messages | Water |
| Nodes | Bus stops, Intersections | Relay stations | Lakes, reservoirs |
| Arcs/Edges | Streets | Communication channels | pipelines, rivers |
Let’s consider the following figure:
Each path through our network has an associated cost:
| from | to | cost |
|---|---|---|
| LA | Omaha | 1 |
| LA | Topeka | 7 |
| Omaha | Chicago | 2 |
| Omaha | Indianapolis | 7 |
| Omaha | Topeka | 5 |
| Topeka | Indianapolis | 3 |
| Chicago | Topeka | 1 |
| Chicago | Boston | 8 |
| Indianapolis | Chicago | 3 |
| Indianapolis | Boston | 2 |
Using linear programming to solve such network problems is called network flow programming, and any such problem can be expressed as a minimum-cost flow program. For the vast majority of our network problems, we will be minimizing the objective function. Every node in a network flow has two flows: inflow and outflow. These flows are encoded in a sparse node-arc incidence matrix. At an ordinary (transshipment) node, outflow must balance inflow, \(\sum_k x_{ik} - \sum_j x_{ji} = 0\): this is called flow conservation. We can define a source node by changing that equation to \(\sum_k x_{ik} - \sum_j x_{ji} = e\) (the node sends out \(e\) more units than it receives); alternatively, we can define a sink node with \(\sum_k x_{ik} - \sum_j x_{ji} = -e\).
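For example, in the network above, Omaha receives flow only from LA and ships to Chicago, Indianapolis, and Topeka, so flow conservation at Omaha reads:

\[ x_{\text{Omaha,Chicago}} + x_{\text{Omaha,Indianapolis}} + x_{\text{Omaha,Topeka}} - x_{\text{LA,Omaha}} = 0 \]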
The transportation problem is a classic example of such network flow problems, and we can easily see how to convert it to a linear programming problem:
\[ \begin{align} \text{minimize} \quad & 35x_{11} + 30x_{12} + 40x_{13} + 32x_{14} + 37x_{21} + 40x_{22} + \\ & 42x_{23} + 25x_{24} + 40x_{31} + 15x_{32} + 20x_{33} + 28x_{34} \\ \text{subject to} \quad & x_{11} + x_{12} + x_{13} + x_{14} \leq 1200 \\ & x_{21} + x_{22} + x_{23} + x_{24} \leq 1000 \\ & x_{31} + x_{32} + x_{33} + x_{34} \leq 800 \\ & x_{11} + x_{21} + x_{31} \geq 1100 \\ & x_{12} + x_{22} + x_{32} \geq 400 \\ & x_{13} + x_{23} + x_{33} \geq 750 \\ & x_{14} + x_{24} + x_{34} \geq 750 \\ & x_{ij} \geq 0 \end{align} \]
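Here is a sketch of this exact LP in code, using the ROI package (assuming ROI and the ROI.plugin.glpk solver plugin are installed):

```r
library(ROI)
library(ROI.plugin.glpk)

# Costs in the order x11, x12, ..., x34, read across each supply row
costs = c(35, 30, 40, 32, 37, 40, 42, 25, 40, 15, 20, 28)

# Constraint matrix: three supply rows, then four demand rows
A = rbind(
  c(1,1,1,1, 0,0,0,0, 0,0,0,0),  # supply at origin 1
  c(0,0,0,0, 1,1,1,1, 0,0,0,0),  # supply at origin 2
  c(0,0,0,0, 0,0,0,0, 1,1,1,1),  # supply at origin 3
  c(1,0,0,0, 1,0,0,0, 1,0,0,0),  # demand at destination 1
  c(0,1,0,0, 0,1,0,0, 0,1,0,0),  # demand at destination 2
  c(0,0,1,0, 0,0,1,0, 0,0,1,0),  # demand at destination 3
  c(0,0,0,1, 0,0,0,1, 0,0,0,1)   # demand at destination 4
)
dirs = c(rep("<=", 3), rep(">=", 4))
rhs = c(1200, 1000, 800, 1100, 400, 750, 750)

model = OP(objective = costs,
           constraints = L_constraint(A, dirs, rhs),
           maximum = FALSE)
res = ROI_solve(model, "glpk")
matrix(solution(res), nrow = 3, byrow = TRUE) # shipments: origins x destinations
```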
We could solve the classic problem by hand without too much effort (it would just take a little bit of time). More modern problems, though, would be a bit tougher.
Look at this data:
We would be dealing with a large number of equations and people tend not to do too well with large tables of information.
Maybe we could do better with a visualization?
We have already seen integer constraints in our linear programming problems, but we really have not given them much thought beyond that we are forcing integers. While integer programs seem simple on their face, they are actually far more complex than our standard linear programming problems. This is further complicated by integer programming being divided into a few different families: namely mixed integer programming (MIP) and zero-one programming.
In a MIP problem, some of the decision variables are constrained to be integers while others may take fractional (continuous) values. A zero-one programming problem restricts every variable to 0 or 1, the same type of binary constraint that we discussed with our network models.
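As a tiny made-up illustration (not one of our problems above), a MIP might look like:

\[ \begin{align} \text{maximize} \quad & 3x + 2y \\ \text{subject to} \quad & x + y \leq 4 \\ & x \in \mathbb{Z}_{\geq 0}, \; y \geq 0 \end{align} \]

Here \(x\) must be a whole number while \(y\) is free to be fractional; if we instead required \(x, y \in \{0, 1\}\), we would have a zero-one program.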
We are now in the space where we need to discuss issues related to P versus NP problems. If an answer can be obtained in polynomial time, then it is a P problem.
If a problem cannot be solved in polynomial time, but the solution can be verified in polynomial time, the problem is said to be NP.
Let’s transport ourselves back to the days of people riding the rails. If you have your trusty knapsack with you, you are faced with decisions about what to bring based upon the weight that you can carry. Let us say, for instance, that you have 10 objects with various weights and that you can carry any combination of objects so long as you do not exceed 20 pounds. Let’s also assume that you have a personal value attached to each item, so that some items are more important to you than others. We have a problem where we are trying to fit the things most important to us into our bag, while being constrained by the weight of our bag. The knapsack problem is said to be NP-complete (essentially, no known algorithm is both fast and always correct).
Let’s see what it looks like in notation:
\[ \begin{align} maximize\, \sum_{i=1}^nv_ix_i \\ subject\,to\, \sum_{i=1}^nw_ix_i \leq W, \\ x_{i} \in \{0,1\} \end{align} \]
Where \(n\) is the number of items, \(v_i\) is the value you place on item \(i\), \(w_i\) is its weight, and \(W\) is the weight capacity.
While it looks simple, it is not. Gone are the days when we could fall back on the nice graphical method we used for our linear programs (a graphical approach exists, but it is far more complicated). We could also just go through all of the possible combinations (enumeration), but the number of combinations grows as \(2^n\); even our small example has \(2^{10}=1024\) candidate solutions.
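For a problem this small, we can actually see what enumeration looks like. The sketch below hard-codes the values and weights printed later in this section (weights as printed, to six decimals) and checks every one of the 1024 combinations:

```r
# Item values and weights (taken from the printed output below)
v = c(3, 2, 4, 5, 3, 2, 2, 3, 3, 4)
w = c(2.364618, 1.585276, 3.584392, 2.694829, 2.127758,
      2.303139, 1.343175, 1.406469, 1.926161, 1.597068)
W = 20

combos = as.matrix(expand.grid(rep(list(0:1), 10))) # all 1024 item subsets
feasible = combos %*% w <= W                        # which subsets fit in the bag
values = combos %*% v
best = max(values[feasible])                        # best value among feasible subsets
best
[1] 29
```

This brute-force answer matches the solver result we get below, but doubling the number of items would double the work again and again, which is exactly why we need something smarter.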
Instead, many packages utilize branch and bound algorithms, where the zero-one constraint is relaxed to two continuous constraints (i.e., \(x \leq 1\) and \(x \geq 0\)). The algorithm then performs a tree search, but that is beyond the scope of our engagement.
Let’s see a few different ways of handling the problem in R.
```r
library(ROI)
library(ROI.plugin.glpk)

n = 10 # Setting our number of variables
W = 20 # Our max weight

v = round(runif(n, min = 1, max = 5)) # A random assignment of our values
v
[1] 3 2 4 5 3 2 2 3 3 4

w = runif(n, min = 1, max = 7) # Random weights
w
[1] 2.364618 1.585276 3.584392 2.694829 2.127758 2.303139 1.343175
[8] 1.406469 1.926161 1.597068

constraints = L_constraint(w, "<=", W) # Creating the constraints

model = OP(objective = v,
           constraints = constraints,
           bounds = V_bound(li = 1:n, lb = rep.int(0, n), ui = 1:n, ub = rep.int(1, n)),
           types = rep.int("B", n),
           maximum = TRUE)
model
ROI Optimization Problem:
Maximize a linear objective function of length 10 with
- 10 binary objective variables,
subject to
- 1 constraint of type linear.
- 0 lower and 10 upper non-standard variable bounds.
```
Before we go on to solving the problem, let’s discuss that bounds argument for a minute. We put a function into the bounds argument: V_bound(). The V_bound function takes li and ui, which specify the variables for which we are setting bounds, and lb and ub, which specify the lower and upper bounds of those variables (0, 1).
We can now solve the problem:
```r
res = ROI_solve(model, "glpk", verbose = TRUE)
<SOLVER MSG> ----
GLPK Simplex Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
* 0: obj = 0.000000000e+000 infeas = 0.000e+000 (0)
* 11: obj = 3.018990140e+001 infeas = 0.000e+000 (0)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
10 integer variables, all of which are binary
Integer optimization begins...
+ 11: mip = not found yet <= +inf (1; 0)
+ 20: >>>>> 2.700000000e+001 <= 2.900000000e+001 7.4% (10; 0)
+ 21: >>>>> 2.900000000e+001 <= 2.900000000e+001 0.0% (7; 3)
+ 21: mip = 2.900000000e+001 <= tree is empty 0.0% (0; 19)
INTEGER OPTIMAL SOLUTION FOUND
<!SOLVER MSG> ----

solution(res)
[1] 1 1 1 1 1 1 0 1 1 1
```
In an effort to create a “modern” approach to building things in R, the ompr package allows for model building through pipes.
We will need the following packages:
```r
library(dplyr)
library(ompr)
library(ompr.roi)
```
First, we specify that we are creating a mixed integer model with MIPModel():
```r
model = MIPModel()
```
Next, we add variables to the model. Again, these represent our 10 possible items:
```r
model = MIPModel() %>%
  add_variable(x[i], i = 1:n, type = "binary")
```
Now we need to specify our objective function:
```r
model = MIPModel() %>%
  add_variable(x[i], i = 1:n, type = "binary") %>%
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max")
```
The sum_expr function is essentially giving us the same thing as Excel’s SUMPRODUCT: \(\sum_i v_i x_i\).
After setting our objective, we can add our constraints:
```r
model = MIPModel() %>%
  add_variable(x[i], i = 1:n, type = "binary") %>%
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max") %>%
  add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W)
```
We want the combined weight of our items to be less than or equal to W (which is still 20 pounds).
Finally, we can solve the problem with solve_model and use get_solution to see the results:
```r
model = MIPModel() %>%
  add_variable(x[i], i = 1:n, type = "binary") %>%
  set_objective(sum_expr(v[i] * x[i], i = 1:n), sense = "max") %>%
  add_constraint(sum_expr(w[i] * x[i], i = 1:n) <= W) %>%
  solve_model(with_ROI("glpk", verbose = TRUE))
<SOLVER MSG> ----
GLPK Simplex Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
* 0: obj = 0.000000000e+000 infeas = 0.000e+000 (0)
* 11: obj = 3.018990140e+001 infeas = 0.000e+000 (0)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
1 row, 10 columns, 10 non-zeros
10 integer variables, all of which are binary
Integer optimization begins...
+ 11: mip = not found yet <= +inf (1; 0)
+ 20: >>>>> 2.700000000e+001 <= 2.900000000e+001 7.4% (10; 0)
+ 21: >>>>> 2.900000000e+001 <= 2.900000000e+001 0.0% (7; 3)
+ 21: mip = 2.900000000e+001 <= tree is empty 0.0% (0; 19)
INTEGER OPTIMAL SOLUTION FOUND
<!SOLVER MSG> ----

get_solution(model, x[i])
   variable  i value
1         x  1     1
2         x  2     1
3         x  3     1
4         x  4     1
5         x  5     1
6         x  6     1
7         x  7     0
8         x  8     1
9         x  9     1
10        x 10     1
```
Excel time.
Despite living in a world that really likes linearity, we see nonlinear trends/functions everywhere (magnetic field strength, stock portfolio optimization, stress and productivity, etc.). Nonlinear programming (NLP) embraces this nonlinear world and allows problems to break free from straight lines. But…what makes a problem nonlinear?
Let’s start by demonstrating a linear equation. If we have an equation \(3 + 2x_1 + 7x_2\) and supply numbers for \(x_1\) (1:10) and \(x_2\) (11:20), we would get the following line:
It is that same old song and dance that for every unit increase in x, we have a proportional increase in y.
Nonlinear functions behave differently. If we take our previous equation and tweak it by adding a higher order term (\(3 + x_1^3 + 7x_2\)), we get the following:
We no longer have a proportional increase.
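We can check this numerically with the two equations above; the step sizes tell the whole story:

```r
x1 = 1:10
x2 = 11:20

y_linear = 3 + 2 * x1 + 7 * x2 # the original linear equation
y_cubic  = 3 + x1^3 + 7 * x2   # the tweaked version with a cubic term

diff(y_linear) # every unit step adds the same amount
[1] 9 9 9 9 9 9 9 9 9

diff(y_cubic)  # the increase is no longer proportional
[1]  14  26  44  68  98 134 176 224 278
```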
NLP comes into play when any part of the problem, objective function and/or constraints, is nonlinear.
The previous image shows us a really nice 2D version of a nonlinear trend, but for the vast majority of our statistical and mathematical models, we should really be observing them in multidimensional space. Why optimal solutions are so hard to find for nonlinear problems makes more sense once we map the models into three dimensions.
We can see that a 3D surface can have multiple local optima, but only one global optimum.
In NLP, we might not actually find the global optimum; instead, we may land on any one of the local optima. Why can’t we find the global optimum? Our starting point is one source of blame. When we (or our program) select a starting point on the surface, the algorithm traverses the steepest path it can find until it stops seeing incremental change. This is further complicated by the fact that the global optimum could be anywhere within the feasible region of a nonlinear problem (not just at the extreme edges, as in our linear programming problems), and even further complicated by the possibility of disjoint feasible regions in a nonlinear problem.
You can breathe a sigh of relief: we won’t be messing around with R for NLP. It is not that R can’t do it (it can, and does it very well), but that the code gets very heavy.
Let’s use the following equation:
\[y = x^2 -4x + 20 \]
We might want to try three different things: minimize \(y\), solve for \(y = 80\), and maximize \(y\) with a lower and upper bound of 0 and 10 on \(x\).
We can go to Excel to find the solution to our problem, but we could almost pick out how to best minimize the function just by looking at the graph.
Either way, we get values of y = 16 and x = 2 when y is minimized.
Let’s see what it looks like when we set y = 80.
You can get two possible answers for that one.
And finally, what happens when we maximize with our constraints on x?
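All three answers can be checked by hand. Completing the square gives the minimum, and factoring handles \(y = 80\):

\[ \begin{align} y &= x^2 - 4x + 20 = (x - 2)^2 + 16, & \text{so } y \text{ is minimized at } x = 2, \, y = 16 \\ x^2 - 4x + 20 &= 80 \implies (x - 10)(x + 6) = 0, & \text{so } x = 10 \text{ or } x = -6 \end{align} \]

Because the parabola opens upward, the maximum on \(0 \leq x \leq 10\) occurs at an endpoint: \(y(0) = 20\) and \(y(10) = 80\), so the constrained maximum is \(y = 80\) at \(x = 10\).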
With the conceptual stuff out of the way, let’s look at a few real examples.